ABSTRACT


Let $(X_1,Y_1),\ldots,(X_n,Y_n)$ be iid rv's with pdf $f(x,y)$ and let $m(x) = E(Y \mid X = x) = \int y f(x,y)\,dy / f_X(x)$ be the regression function of Y on X. The function $m(x)$ is estimated by $m_n(x)$, a solution of $(nh)^{-1} \sum_{i=1}^{n} K((x-X_i)/h)\,\psi(Y_i - \cdot) = 0$ for some odd and bounded $\psi$-function, making $m_n(x)$ a robust estimate of $m(x)$.

Probabilities of maximal deviation of $|m_n(x) - m(x)|$ are computed in a similar way as in Bickel and Rosenblatt (1973) for density estimation and in Johnston (1982) for nonparametric regression function estimation.


1.  BACKGROUND  AND  INTRODUCTION 


Nadaraya (1964) and Watson (1964) independently proposed the following kernel estimator

(1.1)  $m_n^*(x) = \dfrac{(nh_n)^{-1} \sum_{i=1}^{n} K((x-X_i)/h_n)\, Y_i}{(nh_n)^{-1} \sum_{i=1}^{n} K((x-X_i)/h_n)}$

of the regression function $m(x) = \int y f(x,y)\,dy / f_X(x)$, where $f_X(x)$ denotes the marginal density of X, $K(\cdot)$ is a kernel and $\{h_n\}$ is a sequence of positive constants ("bandwidths"). Basically this estimator averages the Y's around X = x, motivated by the integral formula for $m(x)$ above. The numerator is a weighted local average of the Y's while the denominator is a density estimate of $f_X(x)$.
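The ratio form of the Nadaraya-Watson estimator can be illustrated with a minimal Python sketch. The Epanechnikov kernel and the toy data are assumptions made for illustration only; they do not come from the paper.

```python
def epanechnikov(u):
    """Illustrative kernel K(u) = 0.75 (1 - u^2) on [-1, 1] (an assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def nadaraya_watson(x, xs, ys, h, kernel=epanechnikov):
    """Kernel-weighted local average of the Y's around X = x."""
    weights = [kernel((x - xi) / h) for xi in xs]
    denom = sum(weights)  # equals n*h times the kernel density estimate at x
    if denom == 0.0:
        raise ValueError("no observations inside the kernel window")
    numer = sum(w * yi for w, yi in zip(weights, ys))
    return numer / denom


# Hypothetical toy data: a flat curve, then the same curve with one spiky Y.
xs = [0.1, 0.2, 0.3, 0.4, 0.5]
flat = [1.0, 1.0, 1.0, 1.0, 1.0]
spiky = [1.0, 1.0, 100.0, 1.0, 1.0]
print(nadaraya_watson(0.3, xs, flat, h=0.25))
print(nadaraya_watson(0.3, xs, spiky, h=0.25))
```

The second call shows how a single outlying Y pulls the local average far away from the level of its neighbours, which is the behaviour the robust estimate is meant to avoid.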

It is clear that occasional outliers generated by heavy-tailed conditional densities $f(y|x)$ introduce smooth peaks and troughs in the estimated curve $m_n^*(x)$. Such outliers occur quite often in practice (Ruppert et al., 1982, Figure 7, or Bussian et al., 1982). To avoid this misleading property of $m_n^*(x)$ due to spiky Y-observations we introduce a robust estimate, the M-smoother $m_n(x)$, as the solution of

(1.2)  $(nh_n)^{-1} \sum_{i=1}^{n} K((x-X_i)/h_n)\,\psi(Y_i - \cdot) = 0,$

where $\psi$ denotes a bounded, odd and continuous function. Note that if $\psi(u) = u$, then $m_n$ is the Nadaraya-Watson estimator $m_n^*$. Bias and variance rates for $m_n(x)$ with K as the uniform window were obtained by Stuetzle and Mittal (1979); robustness properties, consistency and asymptotic normality of $m_n(x)$ were considered by Hardle (1982). For the case of nonrandom design, i.e. $X_i$ attains fixed values, we may refer to Hardle and Gasser (1982). In this paper we show that
(1.3)  $P\{(2\delta \log n)^{1/2}\,[\sup_{0 \le t \le 1} |(m_n(t) - m(t))\, r(t)| / \lambda(K)^{1/2} - d_n] < x\} \to \exp(-2\exp(-x)),$

where $\delta$, $r(t)$, $\lambda(K)$, $d_n$ are suitable scaling parameters.

The  result  (1.3)  improves  upon  that  of  Johnston  (1982)  in  a  number  of 
ways.  First,  Johnston  obtains  results  like  (1.3),  but  for  estimates  different 
from  the  Nadaraya-Watson  estimator  (1.1);  our  result  (1.3)  of  course  applies 
to  the  Nadaraya-Watson  estimator  as  a  special  case.  Secondly,  (1.3)  holds  for 
a  much  broader  class  of  estimators.  Finally,  we  obtain  (1.3)  under  assumptions 
weaker  than  those  needed  by  Johnston. 
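The defining equation (1.2) of the M-smoother can be solved numerically at a single point x. The sketch below is only an illustration under stated assumptions: the Huber function with cutoff 1.345 stands in for the bounded, odd, continuous $\psi$, the Epanechnikov kernel stands in for K, and bisection is used because the weighted score is nonincreasing in the location argument. None of these choices is prescribed by the paper.

```python
def epanechnikov(u):
    """Illustrative kernel on [-1, 1] (an assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def huber_psi(u, c=1.345):
    """Bounded, odd, continuous psi: the identity clipped to [-c, c]."""
    return max(-c, min(c, u))


def m_smoother(x, xs, ys, h, psi=huber_psi, kernel=epanechnikov, tol=1e-10):
    """Solve sum_i K((x - X_i)/h) * psi(Y_i - theta) = 0 for theta by bisection."""
    pairs = [(kernel((x - xi) / h), yi) for xi, yi in zip(xs, ys)]
    pairs = [(w, y) for w, y in pairs if w > 0.0]
    if not pairs:
        raise ValueError("no observations inside the kernel window")

    def score(theta):
        return sum(w * psi(y - theta) for w, y in pairs)

    # score is >= 0 at the smallest Y and <= 0 at the largest Y.
    lo = min(y for _, y in pairs)
    hi = max(y for _, y in pairs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical toy data with one spiky observation.
xs = [0.1, 0.2, 0.3, 0.4, 0.5]
ys = [1.0, 1.0, 100.0, 1.0, 1.0]
print(m_smoother(0.3, xs, ys, h=0.25))
```

Because $\psi$ is bounded, the outlying observation contributes at most a fixed amount to the score, so the solution stays close to the level of the remaining observations.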

2.  ASSUMPTIONS  AND  RESULTS 

From here on we write h for the bandwidth $h_n$ unless the dependence on n needs to be made explicit. We make use of the following assumptions.

(A1) the kernel $K(\cdot)$ is positive, has compact support $[-A,A]$ and is continuously differentiable.

(A2) $(nh)^{-1/2}(\log n)^{3/2} \to 0$, $(n \log n)^{1/2} h^{5/2} \to 0$, $(nh^3)^{-1}(\log n)^3 \le M$, M a constant.

(A3) $h^{-3}(\log n) \int_{|y|>a_n} f_Y(y)\,dy = O(1)$, $f_Y(y)$ the marginal density of Y, $\{a_n\}_{n=1}^{\infty}$ a sequence of constants tending to infinity as $n \to \infty$.

(A4) $\inf_{0 \le t \le 1} |q(t)| \ge q_0 > 0$, where $q(t) = E(\psi'(Y - m(t)) \mid X = t) \cdot f_X(t)$.

(A5) the regression function $m(x)$ is twice continuously differentiable, the conditional densities $f(y|x)$ are symmetric for all x, $\psi$ is piecewise twice continuously differentiable.

We  need  some  more  definitions  before  we  discuss  the  assumptions. 
Define

$\sigma^2(t) = E(\psi^2(Y - m(t)) \mid X = t)$

$H_n(t) = (nh)^{-1} \sum_{i=1}^{n} K((t-X_i)/h)\,\psi(Y_i - m(t))$

$D_n(t) = (nh)^{-1} \sum_{i=1}^{n} K((t-X_i)/h)\,\psi'(Y_i - m(t)).$

We further assume that $\sigma^2(t)$ and $f_X(t)$ are differentiable.

Assumption (A1) on the compact support of the kernel could possibly be relaxed by introducing a cutoff technique as in Csorgo and Hall (1982) for density estimators. Assumption (A2) has purely technical reasons: to keep the bias down and to ensure the vanishing of the nonlinear remainder terms. Assumption (A3) appears in a somewhat modified form also in Johnston's paper (1982). When we want to apply the following theorem to the Nadaraya-Watson estimator $m_n^*(x)$ we actually have to restate (A3) as $h^{-3}(\log n) \int_{|y|>a_n} y^2 f_Y(y)\,dy = O(1)$ (which is assumption A1 in Johnston (1982)). Assumption (A5) stating the symmetry of the conditional densities is common in robustness considerations (Huber, 1981). It guarantees that the only solution of $\int \psi(y - \cdot) f(y|x)\,dy = 0$ is $m(x) = E(Y \mid X = x)$. If we had skew distributions then we would no longer estimate the conditional mean but rather a conditional quantile such as the median.
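The role of the rate conditions can be checked with simple exponent arithmetic for a power-law bandwidth $h = n^{-\delta}$, the form used in the theorem of this section. The sketch below assumes that (A2) reads $(nh)^{-1/2}(\log n)^{3/2} \to 0$, $(n \log n)^{1/2} h^{5/2} \to 0$ and $(nh^3)^{-1}(\log n)^3 \le M$; this reading is a reconstruction, and the function names are hypothetical.

```python
from fractions import Fraction


def a2_exponents(delta):
    """Power of n in each (A2) quantity when h = n**(-delta); log factors are ignored."""
    delta = Fraction(delta)
    return {
        "(nh)^(-1/2)": -(1 - delta) / 2,
        "n^(1/2) h^(5/2)": (1 - 5 * delta) / 2,
        "(n h^3)^(-1)": 3 * delta - 1,
    }


def satisfies_a2(delta):
    """True when every power of n is strictly negative.

    A zero exponent is not enough, because the accompanying power of
    log n would then diverge.
    """
    return all(e < 0 for e in a2_exponents(delta).values())


for d in (Fraction(1, 5), Fraction(1, 4), Fraction(1, 3)):
    print(d, satisfies_a2(d))
```

Under this reading the admissible exponents are exactly those with $1/5 < \delta < 1/3$: the lower bound keeps the bias negligible and the upper bound comes from the condition involving $nh^3$.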


Theorem

Let $h = n^{-\delta}$, $1/5 < \delta < 1/3$, and $\lambda(K) = \int_{-A}^{A} K^2(u)\,du$ and

$d_n = (2\delta \log n)^{1/2} + (2\delta \log n)^{-1/2}\{\log(c_1(K)/\pi^{1/2}) + \tfrac{1}{2}[\log \delta + \log\log n]\},$

if $c_1(K) = \{K^2(A) + K^2(-A)\}/[2\lambda(K)] > 0$;

$d_n = (2\delta \log n)^{1/2} + (2\delta \log n)^{-1/2} \log(c_2(K)/2\pi)$

otherwise, with $c_2(K) = \int_{-A}^{A} [K'(u)]^2\,du / [2\lambda(K)]$.

Then (1.3) holds with

$r(t) = (nh)^{1/2}\, q(t)\, [\sigma^2(t) f_X(t)]^{-1/2}.$

This theorem can be used to construct uniform confidence intervals for the regression function as stated in the following corollary.

Corollary: Assuming the theorem above holds, an approximate $(1-\alpha) \times 100\%$ confidence band over [0,1] is

$m_n(t) \pm (nh)^{-1/2}\,[\sigma^2(t) f_X(t)]^{1/2}\, q^{-1}(t)\,[d_n + c(\alpha)(2\delta \log n)^{-1/2}] \cdot [\lambda(K)]^{1/2},$

where $c(\alpha) = \log 2 - \log|\log(1-\alpha)|$.
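The constants entering the band can be evaluated directly. The following Python sketch assumes the band has half-width $(nh)^{-1/2}[\sigma^2(t) f_X(t) \lambda(K)]^{1/2} |q(t)|^{-1} [d_n + c(\alpha)(2\delta\log n)^{-1/2}]$; the function names are hypothetical, and the Epanechnikov constants in the usage lines ($\lambda(K) = 3/5$, $c_1(K) = 0$, $c_2(K) = 5/4$) are an illustrative choice, not taken from the paper.

```python
import math


def c_alpha(alpha):
    """c(alpha) = log 2 - log|log(1 - alpha)|, so that exp(-2 exp(-c)) = 1 - alpha."""
    return math.log(2.0) - math.log(abs(math.log(1.0 - alpha)))


def d_n(n, delta, c1=0.0, c2=None):
    """Centering constant of the theorem for h = n**(-delta)."""
    base = 2.0 * delta * math.log(n)
    if c1 > 0.0:
        corr = math.log(c1 / math.sqrt(math.pi)) + 0.5 * (
            math.log(delta) + math.log(math.log(n))
        )
    else:
        if c2 is None:
            raise ValueError("c2 is required when c1 is zero")
        corr = math.log(c2 / (2.0 * math.pi))
    return math.sqrt(base) + corr / math.sqrt(base)


def band_halfwidth(n, delta, alpha, sigma2, f_x, q, lam, dn):
    """Half-width of the band at one point t, given values of sigma^2, f_X and q at t."""
    h = n ** (-delta)
    level = dn + c_alpha(alpha) / math.sqrt(2.0 * delta * math.log(n))
    return math.sqrt(sigma2 * f_x * lam / (n * h)) / abs(q) * level


# Illustrative values only.
dn = d_n(n=1000, delta=0.25, c2=1.25)
print(dn, band_halfwidth(1000, 0.25, 0.05, sigma2=1.0, f_x=1.0, q=1.0, lam=0.6, dn=dn))
```

In practice $\sigma^2(t)$, $f_X(t)$ and $q(t)$ are unknown and would have to be replaced by estimates; the sketch takes them as given inputs.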

The proof is essentially based on a linearization argument using a Taylor series expansion. The leading linear term will then be approximated in a similar way as in Johnston (1982) and Bickel and Rosenblatt (1973). The main idea behind the proof is a strong approximation of the empirical process of $\{(X_i,Y_i)\}_{i=1}^{n}$ by a sequence of Brownian bridges (with two-dimensional time) as provided by Tusnady (1977).

It follows by Taylor expansions applied to the defining equation (1.2) that

(2.1)  $m_n(t) - m(t) = (H_n(t) - EH_n(t))/q(t) + R_n(t),$

where $[H_n(t) - EH_n(t)]/q(t)$ is the leading linear term and

(2.2)  $R_n(t) = H_n(t)[q(t) - D_n(t)]/[D_n(t) \cdot q(t)] + EH_n(t)/q(t) + \tfrac{1}{2}(m_n(t) - m(t))^2 \cdot [D_n(t)]^{-1} \cdot (nh)^{-1} \sum_{i=1}^{n} K((t-X_i)/h)\,\psi''(Y_i - m(t) + r_n^{(i)}(t)),$

$|r_n^{(i)}(t)| < |m_n(t) - m(t)|,$

is the remainder term. In the third section it is shown (Lemma 3.1) that

$\|R_n\| = \sup_{0 \le t \le 1} |R_n(t)| = o_p((nh \log n)^{-1/2}).$
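The steps behind (2.1) and (2.2) can be written out as follows. This is a sketch of the expansion, assuming $\psi$ is twice differentiable at the points involved and writing $r_n^{(i)}(t)$ for the intermediate displacement in the Lagrange remainder.

```latex
\begin{align*}
0 &= (nh)^{-1}\sum_{i=1}^{n} K\!\left(\tfrac{t-X_i}{h}\right)
      \psi\bigl(Y_i - m_n(t)\bigr) \\
  &= H_n(t) - \bigl(m_n(t)-m(t)\bigr) D_n(t)
     + \tfrac12 \bigl(m_n(t)-m(t)\bigr)^2 (nh)^{-1}\sum_{i=1}^{n}
       K\!\left(\tfrac{t-X_i}{h}\right)
       \psi''\bigl(Y_i - m(t) + r_n^{(i)}(t)\bigr), \\[4pt]
m_n(t)-m(t) &= \frac{H_n(t)}{D_n(t)}
     + \tfrac12 \bigl(m_n(t)-m(t)\bigr)^2 [D_n(t)]^{-1} (nh)^{-1}\sum_{i=1}^{n}
       K\!\left(\tfrac{t-X_i}{h}\right)
       \psi''\bigl(Y_i - m(t) + r_n^{(i)}(t)\bigr), \\[4pt]
\frac{H_n(t)}{D_n(t)} &= \frac{H_n(t)-EH_n(t)}{q(t)} + \frac{EH_n(t)}{q(t)}
     + \frac{H_n(t)\,[q(t)-D_n(t)]}{D_n(t)\,q(t)} .
\end{align*}
```

The first term in the last line is the leading linear term; everything else is collected in $R_n(t)$.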


Furthermore the rescaled linear part

$Y_n(t) = (nh)^{1/2}\,[\sigma^2(t) f_X(t)]^{-1/2}\,(H_n(t) - EH_n(t))$

is approximated by a sequence of Gaussian processes, leading finally to the following process

$Y_{5,n}(t) = h^{-1/2} \int K((t-x)/h)\,dW(x),$

as in Bickel and Rosenblatt (1973).

We also need the Rosenblatt transformation (Rosenblatt, 1952),

$T(x,y) = (F_{X|Y}(x|y),\, F_Y(y)),$

which transforms $(X_i,Y_i)$ into $T(X_i,Y_i) = (X_i', Y_i')$, mutually independent uniform rv's. With the aid of this transformation Theorem 1 of Tusnady (1977) may be applied to obtain the following lemma.

Lemma 2.1: On a suitable probability space there exists a sequence of Brownian bridges $B_n$ such that

$\sup_{x,y} |Z_n(x,y) - B_n(T(x,y))| = O(n^{-1/2}(\log n)^2)$  a.s.,

where $Z_n(x,y) = n^{1/2}[F_n(x,y) - F(x,y)]$ denotes the empirical process of $\{(X_i,Y_i)\}_{i=1}^{n}$.
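As an illustration of the Rosenblatt transformation $T(x,y) = (F_{X|Y}(x|y), F_Y(y))$, the sketch below evaluates it in closed form for a standard bivariate normal pair with correlation `rho`. The normal model is an assumption made purely for the example; the paper does not restrict $f(x,y)$ to this family.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rosenblatt_transform(x, y, rho):
    """T(x, y) for a standard bivariate normal with correlation rho.

    Given Y = y, X is normal with mean rho*y and variance 1 - rho^2,
    and Y itself is standard normal.
    """
    cond_sd = math.sqrt(1.0 - rho * rho)
    u = std_normal_cdf((x - rho * y) / cond_sd)  # F_{X|Y}(x | y)
    v = std_normal_cdf(y)                        # F_Y(y)
    return u, v


print(rosenblatt_transform(0.0, 0.0, rho=0.5))
print(rosenblatt_transform(1.0, 1.0, rho=0.5))
```

Applied to a sample from the model, both coordinates are uniform on (0,1) and independent of each other, which is what allows the two-dimensional strong approximation to be used.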


Before we define the different approximating processes let us first rewrite $Y_n(t)$ as a stochastic integral with respect to the empirical process $Z_n(x,y)$:

$Y_n(t) = h^{-1/2}\, g'(t)^{-1/2} \iint K((t-x)/h)\,\psi(y - m(t))\,dZ_n(x,y), \quad g'(t) = \sigma^2(t) f_X(t).$

The approximating processes are now

$Y_{0,n}(t) = (h g(t))^{-1/2} \iint_{\Gamma_n} K((t-x)/h)\,\psi(y - m(t))\,dZ_n(x,y),$

where $\Gamma_n = \{|y| \le a_n\}$, $g(t) = E(\psi^2(y - m(t)) \cdot I(|y| \le a_n) \mid X = t) \cdot f_X(t)$;

$Y_{1,n}(t) = (h g(t))^{-1/2} \iint_{\Gamma_n} K((t-x)/h)\,\psi(y - m(t))\,dB_n(T(x,y)),$

$\{B_n\}$ being the sequence of Brownian bridges from Lemma 2.1;

$Y_{2,n}(t) = (h g(t))^{-1/2} \iint_{\Gamma_n} K((t-x)/h)\,\psi(y - m(t))\,dW_n(T(x,y)),$

$\{W_n\}$ being the sequence of Wiener processes satisfying $B_n(x',y') = W_n(x',y') - x'y'\,W_n(1,1)$;

$Y_{3,n}(t) = (h g(t))^{-1/2} \iint_{\Gamma_n} K((t-x)/h)\,\psi(y - m(x))\,dW_n(T(x,y));$

$Y_{4,n}(t) = (h g(t))^{-1/2} \int g(x)^{1/2} K((t-x)/h)\,dW(x);$

$Y_{5,n}(t) = h^{-1/2} \int K((t-x)/h)\,dW(x),$

$\{W(\cdot)\}$ being the Wiener process on $(-\infty, \infty)$.

Lemmas 3.2 to 3.7 ensure that all these processes have the same limit distributions. The result then follows from the following lemma.

Lemma 2.2 (Bickel and Rosenblatt (1973)). Let $d_n$, $\lambda(K)$, $\delta$ be as in the theorem. Let

$Y_{5,n}(t) = h^{-1/2} \int K((t-x)/h)\,dW(x).$

Then

$P\{(2\delta \log n)^{1/2}\,[\sup_{0 \le t \le 1} |Y_{5,n}(t)| / [\lambda(K)]^{1/2} - d_n] < x\} \to e^{-2e^{-x}}.$
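The normalization by $[\lambda(K)]^{1/2}$ reflects the fact that the Wiener integral $Y_{5,n}(t)$ has variance $h^{-1}\int K^2((t-x)/h)\,dx = \lambda(K)$ for every t and h. The sketch below checks this identity numerically with a midpoint rule; the Epanechnikov kernel (for which $\lambda(K) = 3/5$) and the function names are assumptions for illustration.

```python
def epanechnikov(u):
    """Illustrative kernel on [-1, 1] (an assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def variance_y5(t, h, kernel=epanechnikov, support=1.0, steps=20000):
    """Midpoint-rule value of h^{-1} * integral of K((t - x)/h)^2 dx."""
    a = t - support * h
    b = t + support * h
    dx = (b - a) / steps
    total = 0.0
    for j in range(steps):
        x = a + (j + 0.5) * dx
        total += kernel((t - x) / h) ** 2
    return total * dx / h


print(variance_y5(0.5, 0.1))
print(variance_y5(0.2, 0.05))
```

Both calls return values close to 0.6, independently of the location t and the bandwidth h, so dividing by $[\lambda(K)]^{1/2}$ standardizes the process to unit variance.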


3.  PROOFS 

We show first that $\|R_n\| = \sup_{0 \le t \le 1} |R_n(t)|$ vanishes asymptotically with the desired rate $(nh \log n)^{-1/2}$.

Lemma 3.1: For the remainder term $R_n(t)$ defined in (2.2) we have

(3.1)  $\|R_n\| = o_p((nh \log n)^{-1/2}).$

Proof: First we have by the positivity of the kernel K and $|\psi''| \le C_1$,

$\|R_n\| \le [\inf_{0 \le t \le 1}(|D_n(t)| \cdot |q(t)|)]^{-1} \{\|H_n\| \cdot \|q - D_n\| + \|D_n\| \cdot \|EH_n\|\} + C_1 \cdot \|m_n - m\|^2\, [\inf_{0 \le t \le 1} |D_n(t)|]^{-1} \cdot \|f_n\|,$

where $f_n(x) = (nh)^{-1} \sum_{i=1}^{n} K((x-X_i)/h)$.

The desired result (3.1) will then follow if we prove the following:

(3.2)  $\|H_n\| = o_p(n^{-1/4} h^{-1/4} (\log n)^{-1/2})$

(3.3)  $\|q - D_n\| = o_p(n^{-1/4} h^{-1/4} (\log n)^{-1/2})$

(3.4)  $\|EH_n\| = O(h^2)$

(3.5)  $\|m_n - m\|^2 = o_p((nh)^{-1/2} (\log n)^{-1/2}).$

Define $U_n(t) = n^{1/4} h^{1/4} (\log n)^{1/2} [H_n(t) - EH_n(t)]$.

We first show that $U_n(t) \to_p 0$ for all t. This follows from Markov's inequality since

$U_n(t) = \sum_{i=1}^{n} U_{i,n}(t),$

where $U_{i,n}(t) = n^{-3/4} h^{-3/4} (\log n)^{1/2} [K((t-X_i)/h)\,\psi(Y_i - m(t)) - EK((t-X)/h)\,\psi(Y - m(t))]$ are iid rv's, and thus

$P(|U_n(t)| > \varepsilon) \le \varepsilon^{-2}\, n^{-1/2} h^{-1/2} (\log n) \cdot h^{-1} EK^2((t-X)/h)\,\psi^2(Y - m(t)).$

The RHS of this inequality tends to zero since

$h^{-1} EK^2((t-X)/h)\,\psi^2(Y - m(t)) = h^{-1} \int K^2((t-u)/h)\, E(\psi^2(Y - m(t)) \mid X = u)\, f_X(u)\,du \sim \sigma^2(t) \cdot f_X(t) \cdot \int K^2(u)\,du$

by continuity of $\sigma^2(t)$ and $f_X(t)$.

Next we show the tightness of $U_n(t)$ using the following moment condition (Billingsley, 1968, Th. 15.6):

$E\{|U_n(t) - U_n(t_1)| \cdot |U_n(t_2) - U_n(t)|\} \le C_2 \cdot (t_2 - t_1)^2,$

where $C_2$ is a constant.

By the Schwarz inequality,

$E\{|U_n(t) - U_n(t_1)| \cdot |U_n(t_2) - U_n(t)|\} \le \{E[U_n(t) - U_n(t_1)]^2 \cdot E[U_n(t_2) - U_n(t)]^2\}^{1/2}.$

It suffices to consider only the term $E[U_n(t) - U_n(t_1)]^2$.

Using the Lipschitz continuity of $K$, $\psi$, $m$ and assumption (A2) we have

$\{E[U_n(t) - U_n(t_1)]^2\}^{1/2} \le (\log n)^{1/2} (nh)^{-3/4} \{E[A+B]^2\}^{1/2}$

$\le C_3 (nh)^{-1/4} (\log n)^{1/2} |t - t_1| + C_4\, n^{-1/4} h^{-3/4} (\log n)^{1/2} |t - t_1| \le C_5 \cdot |t - t_1|,$

where

$A = \sum_{i=1}^{n} K((t-X_i)/h)\,[\psi(Y_i - m(t)) - \psi(Y_i - m(t_1))],$

$B = \sum_{i=1}^{n} \psi(Y_i - m(t_1))\,[K((t-X_i)/h) - K((t_1-X_i)/h)],$

and $C_3$, $C_4$ are Lipschitz bounds for $\psi$, $m$, $K$.

Statement (3.4) follows from the well-known bias calculation

$EH_n(t) = h^{-1} \int K((t-u)/h)\, E(\psi(y - m(t)) \mid X = u)\, f_X(u)\,du = O(h^2),$

where $O(h^2)$ is independent of t (Parzen, 1962). Hence we have from assumption (A2) that $\|EH_n\| = o((nh)^{-1/2} (\log n)^{-1/2})$.

Statement (3.2) thus follows using tightness of $U_n(t)$ and the inequality

$\|H_n\| \le \|H_n - EH_n\| + \|EH_n\|.$

Statement (3.3) follows in the same way as (3.2) using assumption (A2) and the continuity properties of $K$, $\psi'$, $m$.

Finally from Hardle and Luckhaus (1982), where uniform consistency of $m_n(t) - m(t)$ is shown, we have

$\|m_n - m\| = O_p((nh)^{-1/2} (\log n)^{1/2}),$

which implies (3.5).

Now the assertion of the lemma follows since by tightness of $D_n(t)$, $\inf_{0 \le t \le 1} |D_n(t)| \ge q_0 + o_p(1)$ and thus

$\|R_n\| = o_p((nh)^{-1/2} (\log n)^{-1/2})\,(1 + \|f_n\|).$

Finally by Theorem 3.1 of Bickel and Rosenblatt (1973) $\|f_n\| = O_p(1)$, thus the desired result $\|R_n\| = o_p((nh)^{-1/2} (\log n)^{-1/2})$ follows. In the nonrobust case, i.e. $\psi(u) = u$, the remainder term $R_n$ reads

(3.6)  $R_n = [m_n^* - m]\,[f_X - f_n]/f_X + E(\bar m_n - m f_n)/f_X,$

where $\bar m_n(x) = (nh)^{-1} \sum_{i=1}^{n} K((x-X_i)/h)\,Y_i$.

Johnston (1982) proved that $(\bar m_n - E\bar m_n)/f_X$ has the desired asymptotic distribution as stated in our Theorem.

So if we apply the recent result of Mack and Silverman (1982) or Hardle and Luckhaus (1982) to $\|m_n^* - m\|$ and the well-known result from Bickel and Rosenblatt (1973) to $\|f_n - f_X\|$ we may conclude that the first term on the RHS of (3.6) is $o_p((nh)^{-1/2} (\log n)^{-1/2})$. The second term in (3.6) is

$[h^{-1} \int K((t-u)/h)\, m(u) f_X(u)\,du - m(t)\, h^{-1} \int K((t-u)/h)\, f_X(u)\,du]/f_X(t),$

which is, by the same calculations as mentioned above (Parzen, 1962), of the order $O(h^2)$. This shows that our result generalizes Johnston's paper. Our theorem says also that the confidence bounds are smaller. Johnston had $s^2(t) = E(Y^2 \mid X = t)$ as a factor for the asymptotic confidence bound; we have $\sigma^2(t) = \mathrm{var}(Y \mid X = t)$, which is in general smaller than $s^2(t)$. We now begin with the subsequent approximations of the processes $Y_{0,n}$ to $Y_{5,n}$.

Lemma 3.2:

$\|Y_{0,n} - Y_{1,n}\| = O((nh)^{-1/2} (\log n)^2)$  a.s.

Proof: Let t be fixed and put $L(y) = \psi(y - m(t))$, still depending on t.

Use integration by parts and obtain:

$\iint_{\Gamma_n} L(y) K((t-x)/h)\,dZ_n(x,y) = \int_{u=-A}^{A} \int_{y=-a_n}^{a_n} L(y) K(u)\,dZ_n(t - h \cdot u, y)$

$= \int_{-A}^{A} \int_{-a_n}^{a_n} Z_n(t - h \cdot u, y)\,d[L(y)K(u)] + L(a_n) \int_{-A}^{A} Z_n(t - h \cdot u, a_n)\,dK(u)$

$\quad - L(-a_n) \int_{-A}^{A} Z_n(t - h \cdot u, -a_n)\,dK(u)$

$\quad + K(A)\,[\int_{-a_n}^{a_n} Z_n(t - h \cdot A, y)\,dL(y) + L(a_n) Z_n(t - h \cdot A, a_n) - L(-a_n) Z_n(t - h \cdot A, -a_n)]$

$\quad - K(-A)\,[\int_{-a_n}^{a_n} Z_n(t + h \cdot A, y)\,dL(y) + L(a_n) Z_n(t + h \cdot A, a_n) - L(-a_n) Z_n(t + h \cdot A, -a_n)].$

If we apply the same operations to $Y_{1,n}$ with $B_n(T(x,y))$ instead of $Z_n(x,y)$ and use Lemma 2.1 we finally obtain

$\sup_{0 \le t \le 1} h^{1/2} g(t)^{1/2}\, |Y_{0,n}(t) - Y_{1,n}(t)| = O(n^{-1/2} (\log n)^2)$  a.s.,

using the differentiability and boundedness of $\psi$.


Lemma 3.3:

$\|Y_{1,n} - Y_{2,n}\| = O_p(h^{1/2}).$

Proof: Note that the Jacobian of $T(x,y)$ is $f(x,y)$, hence

$|Y_{1,n}(t) - Y_{2,n}(t)| = |(g(t)h)^{-1/2} \iint_{\Gamma_n} \psi(y - m(t)) K((t-x)/h) f(x,y)\,dx\,dy| \cdot |W_n(1,1)|.$

It follows that

$h^{-1/2} \|Y_{1,n} - Y_{2,n}\| \le |W_n(1,1)| \cdot \|g^{-1/2}\| \cdot \sup_{0 \le t \le 1} h^{-1} \iint_{\Gamma_n} |\psi(y - m(t)) K((t-x)/h)|\, f(x,y)\,dx\,dy.$

Since $\|g^{-1/2}\|$ is bounded by assumption and $\psi$ is bounded we have

$h^{-1/2} \|Y_{1,n} - Y_{2,n}\| \le |W_n(1,1)| \cdot C_4 \cdot h^{-1} \int K((t-x)/h)\,dx = O_p(1).$

Lemma 3.4:

$\|Y_{2,n} - Y_{3,n}\| = O_p(h^{1/2}).$

Proof: The difference $|Y_{2,n}(t) - Y_{3,n}(t)|$ may be written as

$|(g(t)h)^{-1/2} \iint_{\Gamma_n} [\psi(y - m(t)) - \psi(y - m(x))]\, K((t-x)/h)\,dW_n(T(x,y))|.$

If we use the fact that $\psi$, $m$ are uniformly continuous this is smaller than

$h^{-1/2} |g(t)|^{-1/2} \cdot O_p(h)$

and the lemma thus follows.
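Lemma 3.5:

$\|Y_{4,n} - Y_{5,n}\| = O_p(h^{1/2}).$

Proof:

$|Y_{4,n}(t) - Y_{5,n}(t)| = h^{-1/2}\, |\int [(g(x)/g(t))^{1/2} - 1]\, K((t-x)/h)\,dW(x)|$

$\le h^{-1/2}\, |\int_{-A}^{A} W(t - hu)\, \frac{\partial}{\partial u} \{[(g(t-hu)/g(t))^{1/2} - 1]\, K(u)\}\,du|$

$\quad + h^{-1/2}\, |K(A)\, W(t - hA)\, \{[g(t-hA)/g(t)]^{1/2} - 1\}|$

$\quad + h^{-1/2}\, |K(-A)\, W(t + hA)\, \{[g(t+hA)/g(t)]^{1/2} - 1\}|$

$= S_{1,n}(t) + S_{2,n}(t) + S_{3,n}(t)$, say.

The second term can be estimated by

$h^{-1/2} \|S_{2,n}\| \le K(A) \cdot \sup_{0 \le t \le 1} |W(t - Ah)| \cdot \sup_{0 \le t \le 1} h^{-1}\, |[g(t-Ah)/g(t)]^{1/2} - 1|;$

by the mean value theorem it follows that

$h^{-1/2} \|S_{2,n}\| = O_p(1).$

The first term $S_{1,n}$ is estimated as follows. Writing $\dot g$ for the derivative of $g$,

$h^{-1/2} S_{1,n}(t) = |h^{-1} \int_{-A}^{A} W(t - uh)\, K'(u)\, \{[g(t-uh)/g(t)]^{1/2} - 1\}\,du - \tfrac{1}{2} \int_{-A}^{A} W(t - uh)\, K(u)\, [g(t-uh)/g(t)]^{-1/2}\, [\dot g(t-uh)/g(t)]\,du|$

$= |T_{1,n}(t) - T_{2,n}(t)|$, say.

$\|T_{2,n}\| \le C_5 \cdot \int_{-A}^{A} |W(t - hu)|\,du = O_p(1)$ by assumption on $g(t) = \sigma^2(t) \cdot f_X(t)$.

To estimate $T_{1,n}$ we again use the mean value theorem to conclude that

$\sup_{0 \le t \le 1} h^{-1}\, |[g(t-uh)/g(t)]^{1/2} - 1| \le C_6 \cdot |u|,$

hence

$\|T_{1,n}\| \le C_6 \cdot \sup_{0 \le t \le 1} \int_{-A}^{A} |W(t - hu)\, K'(u)\, u|\,du = O_p(1).$

Since $S_{3,n}(t)$ is estimated as $S_{2,n}(t)$ we finally obtain the desired result.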

The next lemma shows that the truncation introduced through $\{a_n\}$ does not affect the limiting distribution.

Lemma 3.6:

$\|Y_n - Y_{0,n}\| = O_p((\log n)^{-1/2}).$

Proof: We shall only show that $g'(t)^{-1/2} h^{-1/2} \iint_{\{|y|>a_n\}} \psi(y - m(t)) K((t-x)/h)\,dZ_n(x,y)$ fulfills the lemma. The replacement of $g'(t)$ by $g(t)$ may be proved as in Johnston (1982). The quantity above is less than

$h^{-1/2}\, \|g'^{-1/2}\| \cdot \|\iint_{\{|y|>a_n\}} \psi(y - m(\cdot)) K((\cdot - x)/h)\,dZ_n(x,y)\|.$

It remains to show that the last factor tends to zero at a rate $O_p((\log n)^{-1/2})$. We show first that

$V_n(t) = (\log n)^{1/2} h^{-1/2} \iint_{\{|y|>a_n\}} \psi(y - m(t)) K((t-x)/h)\,dZ_n(x,y) \to_p 0$ for all t

and then we show tightness of $V_n(t)$; the result then follows.

$V_n(t) = (\log n)^{1/2} (nh)^{-1/2} \sum_{i=1}^{n} \{\psi(Y_i - m(t))\, I_{\{|y|>a_n\}}(Y_i)\, K((t-X_i)/h) - E\psi(Y_i - m(t))\, I_{\{|y|>a_n\}}(Y_i)\, K((t-X_i)/h)\} = \sum_{i=1}^{n} X_{n,i}(t),$

where $\{X_{n,i}(t)\}_{i=1}^{n}$ are iid for each n with $EX_{n,i}(t) = 0$ for all $t \in [0,1]$.

We have then

$EX_{n,i}^2(t) \le (\log n)(nh)^{-1}\, E\psi^2(Y_i - m(t))\, I_{\{|y|>a_n\}}(Y_i)\, K^2((t-X_i)/h)$

$\le \sup_{-A \le u \le A} K^2(u) \cdot (\log n)(nh)^{-1}\, E\psi^2(Y_i - m(t))\, I_{\{|y|>a_n\}}(Y_i),$

hence

$\mathrm{var}(V_n(t)) = E(\sum_{i=1}^{n} X_{n,i}(t))^2 = n \cdot EX_{n,i}^2(t) \le \sup_{-A \le u \le A} K^2(u)\, h^{-1} (\log n) \int_{\{|y|>a_n\}} f_Y(y)\,dy \cdot M_\psi,$

where $M_\psi$ denotes an upper bound for $\psi^2$.

This term tends to zero by assumption (A3). Thus by Markov's inequality we conclude that

$V_n(t) \to_p 0$ for all $t \in [0,1]$.

To prove tightness of $\{V_n(t)\}$ we refer again to the following moment condition as stated in Lemma 3.1:

$E\{|V_n(t) - V_n(t_1)| \cdot |V_n(t_2) - V_n(t)|\} \le C' \cdot (t_2 - t_1)^2,$

$C'$ denoting a constant, $t \in [t_1, t_2]$.

We again estimate the left-hand side by Schwarz's inequality and estimate each factor separately.

$E[V_n(t) - V_n(t_1)]^2 = (\log n)(nh)^{-1}\, E\{\sum_{i=1}^{n} [\Psi_n(t,t_1,X_i,Y_i) \cdot I_{\{|y|>a_n\}}(Y_i) - E(\Psi_n(t,t_1,X_i,Y_i) \cdot I_{\{|y|>a_n\}}(Y_i))]\}^2,$

where $\Psi_n(t,t_1,X_i,Y_i) = \psi(Y_i - m(t)) K((t-X_i)/h) - \psi(Y_i - m(t_1)) K((t_1-X_i)/h)$.

Since $\psi$, $m$, $K$ are Lipschitz continuous it follows that

$\{E[V_n(t) - V_n(t_1)]^2\}^{1/2} \le C_7 (\log n)^{1/2} h^{-3/2} |t - t_1| \cdot \{\int_{\{|y|>a_n\}} f_Y(y)\,dy\}^{1/2}.$

If we apply the same estimations to $V_n(t_2) - V_n(t)$ we finally have

$E\{|V_n(t) - V_n(t_1)| \cdot |V_n(t_2) - V_n(t)|\} \le C_7^2 (\log n) h^{-3} |t - t_1|\,|t_2 - t| \int_{\{|y|>a_n\}} f_Y(y)\,dy \le C' \cdot |t_2 - t_1|^2$

since $t \in [t_1, t_2]$, by assumption (A3).

Lemma 3.7:

Let $\lambda(K) = \int K^2(u)\,du$ and let $\{d_n\}$ be as in the theorem. Then

$(2\delta \log n)^{1/2}\,[\|Y_{3,n}\| / [\lambda(K)]^{1/2} - d_n]$

has the same asymptotic distribution as

$(2\delta \log n)^{1/2}\,[\|Y_{4,n}\| / [\lambda(K)]^{1/2} - d_n].$
Proof: $Y_{3,n}(t)$ is a Gaussian process with

$EY_{3,n}(t) = 0$

and covariance function

$r_3(t_1,t_2) = EY_{3,n}(t_1) Y_{3,n}(t_2)$

$= [g(t_1) g(t_2)]^{-1/2} h^{-1} \iint_{\Gamma_n} \psi^2(y - m(x))\, K((t_1-x)/h)\, K((t_2-x)/h)\, f(x,y)\,dx\,dy$

$= h^{-1} [g(t_1) g(t_2)]^{-1/2} \int \{\int_{\Gamma_n} \psi^2(y - m(x))\, f(y|x)\,dy\}\, K((t_1-x)/h)\, K((t_2-x)/h)\, f_X(x)\,dx$

$= h^{-1} [g(t_1) g(t_2)]^{-1/2} \int g(x)\, K((t_1-x)/h)\, K((t_2-x)/h)\,dx$

$= r_4(t_1,t_2),$ the covariance function of the Gaussian process $Y_{4,n}(t)$, which proves the lemma.